Changes in pipeline to account for multiple assemblies #132

Open
wants to merge 1 commit into base: dev

@afg1 (Contributor) commented Sep 15, 2022

I looked at these files (output of rg -l ensembl_assembly):

files/repeats/find-assemblies.sql
files/genome-mapping/post.sql
files/genome-mapping/find_species.sql
files/ftp-export/genome_coordinates/known-coordinates.sql
files/genome-mapping/load.ctl
files/genes/species.sql
files/genes/schema.sql
files/import-data/post-release/001__regions.sql
files/import-data/post-release/001__coordinate-systems.sql
files/import-data/post-release/001__ensembl-pseudogenes.sql
files/import-data/post-release/001__locations.sql
files/import-data/post-release/002__Cleanup_assembly_table.sql
files/import-data/ensembl/known-assemblies.sql
files/import-data/pre-release/000__assemblies.sql
workflows/databases/mirgenedb.nf
rnacentral_pipeline/databases/ensembl/metadata/assemblies.py
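
For anyone who wants to reproduce a listing like the one above without ripgrep, here is a rough Python sketch. The function name `files_containing` is hypothetical and not part of the pipeline. Unlike rg, it does not honor .gitignore (it only skips the .git directory), so its output may include extra files.

```python
from pathlib import Path


def files_containing(root, needle):
    """Return sorted relative paths of files under root whose text contains needle.

    Hypothetical helper, roughly comparable to `rg -l <needle>`; it ignores
    .gitignore rules and only skips the .git directory.
    """
    matches = []
    for path in Path(root).rglob("*"):
        relative = path.relative_to(root)
        if not path.is_file() or ".git" in relative.parts:
            continue
        try:
            # Undecodable bytes are dropped so binary files do not raise.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        if needle in text:
            matches.append(relative.as_posix())
    return sorted(matches)


if __name__ == "__main__":
    for name in files_containing(".", "ensembl_assembly"):
        print(name)
```

Run from the repository root, this prints one matching path per line, which can then be compared against the files touched in the commit.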

I then chased down the uses of those files and checked the workflows that use them. I think I got everything, and I'm fairly sure the changes are OK and will do what we want.

If a file in the list above is not touched in this commit, then I think it didn't need modification.

@blakesweeney (Member)

This looks reasonable to me, but what are we doing with the Cleanup_assembly_table script? I think we can just leave the assemblies alone since we will only use the selected_genomes in the webcode.
